1. Introduction


The  general  problem  considered  in  this  note  is  how  to  locate  a  known  object  from 
sensory data, especially when that object may be occluded by other (possibly unknown)
objects. In previous work [Grimson and Lozano-Perez 84, 87] we described a
recognition  system,  called  RAF  (for  Recognition  and  Attitude  Finder),  that  identifies 
and  locates  objects  from  noisy,  occluded  data.  In  that  work,  we  concentrated  on  a 
particular  subclass  of  rigid  models.  If  the  sensory  data  provided  two-dimensional 
geometric  data,  for  example  intensity  edges  from  a  visual  image,  we  considered 
the  recognition  of  objects  that  consisted  of  sets  of  linear  segments,  or  equivalently, 
polygonal  objects  in  which  some  edges  are  not  included.  If  the  sensory  data  was 
three-dimensional,  we  considered  the  recognition  of  objects  that  consisted  of  sets  of 
planar  fragments,  or  equivalently,  polyhedral  objects  in  which  some  of  the  faces  are 
not  included. 

In  general,  of  course,  we  cannot  guarantee  that  the  recognition  system  will 
be  confronted  only  with  polyhedral  objects.  Since  the  RAF  system  is  reasonably 
insensitive  to  noise,  one  could  deal  with  curved  objects  by  simply  approximating 
them  with  polyhedral  models  that  are  required  to  deviate  from  the  actual  object 
by  no  more  than  some  bounded  amount.  This  has  the  effect  of  introducing  some 
additional  error  into  the  process,  which  the  system  has  been  able  to  tolerate.  While 
the  RAF  system  has  been  successfully  tested  on  a  range  of  real  data,  including  visual, 
laser,  sonar  and  tactile,  using  approximations  to  curved  objects  as  well  as  polyhedral 
ones [Grimson and Lozano-Perez 87], the assumption of polyhedral models is overly
restrictive. 

In  particular,  one  of  the  difficulties  with  using  polyhedral  approximations  is 
that  they  are  not  stable,  so  that  several  images  of  the  same  object  may  lead  to 
different  approximations,  due  to  small  variations  in  the  imaging.  This  may  lead  to 
difficulties in matching, either causing incorrect matches or removing large portions
of  an  object  from  consideration.  Moreover,  systematic  errors  in  the  approximation 
can  have  serious  effects  on  the  recognition  process.  Consider  an  object  with  a  circular 
hole,  which  is  approximated  in  the  model  by  a  regular  polygon.  Now  suppose  that 
we  take  an  image  of  the  part  in  some  other  orientation,  and  extract  a  polygonal 
approximation of the visible edges of the object. The boundary of the hole will
again  be  approximated  by  a  regular  polygon.  If,  however,  the  approximations  are 
rotated  relative  to  one  another,  this  can  lead  to  a  drastic  error  in  locating  the  part, 
since  matching  the  two  descriptions  will  lead  to  a  large  error  in  the  orientation  of 
the  overall  part. 

In  this  note,  we  consider  the  problem  of  extending  the  method  to  deal  directly 
with  two  dimensional  objects  that  include  linear  and  curved  segments,  where  the 
curved  segments  can  be  approximated  by  circular  arcs.  To  describe  this  extension 
to RAF, we must specify the characteristics of the object models, the requirements
on the sensory data, and the search technique used to correctly identify the object
model from the sensory data. Our goal is to obtain a system that can perform as
indicated in Figure 1.
shortly) measured in a local coordinate frame specific to the model. A solution to
the  recognition  problem  consists  of  determining  the  following  components:  a  subset 
of  the  sensory  data  fragments  that  are  believed  to  come  from  a  single  object;  an 
identification  of  which  object,  selected  from  the  library  of  known  objects;  the  model 
face  associated  with  each  data  fragment  in  the  subset;  and  finally  the  coordinate 
frame  transformation  that  maps  the  model  from  its  local  coordinate  frame  into  the 
sensor  coordinate  frame  in  such  a  manner  that  each  data  fragment  correctly  lies  on 
its  assigned  model  face.  In  more  formal  terms,  a  solution  is  a  three-tuple 

$$(\text{object}_i,\ \{(d_{i_1}, m_{j_{i_1}}), (d_{i_2}, m_{j_{i_2}}), \ldots, (d_{i_k}, m_{j_{i_k}})\},\ (R, v_0, s))$$

where $\text{object}_i$ is the name of the ith object in the library, the $(d, m)$ pairings are
associations of a subset of the sensory data $d$ with model faces $m$ from $\text{object}_i$, and
$R$ is a rotation matrix, $v_0$ is a translation vector and $s$ is a scale factor such that a
vector $v_m$ in model coordinates is transformed into a vector $v_d$ in sensor coordinates
by

$$v_d = s R v_m + v_0.$$
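
As a concrete reading of this transformation in two dimensions, the following minimal sketch applies a scale, rotation and translation to one model point. Parameterizing the rotation matrix by a single angle `theta` and the function name `transform_point` are assumptions made for illustration only.

```python
import math

def transform_point(v_m, theta, v0, s=1.0):
    """Map a 2D model point into sensor coordinates: v_d = s * R(theta) * v_m + v0."""
    c, sn = math.cos(theta), math.sin(theta)
    x, y = v_m
    # Rotate by theta, scale by s, then translate by v0.
    return (s * (c * x - sn * y) + v0[0],
            s * (sn * x + c * y) + v0[1])
```

For example, `transform_point((1.0, 0.0), math.pi / 2, (2.0, 3.0), s=2.0)` rotates the point onto the y axis, doubles its length, and shifts it by the translation vector.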

As has been described elsewhere [Grimson and Lozano-Perez 84, 87], we approach
the recognition problem as one of search. Thus, we first focus on finding
legitimate pairings of data and model fragments, for some subset of the sensory
data. We chose to structure this search process as a constrained depth first search,
using an interpretation tree (IT). Each node of the tree describes a partial interpretation
of the data, and implicitly contains a set of pairings of data fragments and
model  faces.  Nodes  at  the  first  level  of  the  tree  contain  assignments  for  the  first 
data  fragment,  nodes  at  the  second  level  contain  assignments  for  the  first  and  second 
data  fragments,  and  so  on.  Each  node  branches  at  the  next  level  in  up  to  n  +  1 
ways,  where  n  is  the  number  of  model  faces  in  the  object.  The  last  branch  is  a  wild 
card  or  null  branch  and  has  the  effect  of  excluding  the  data  fragment  corresponding 
to  the  current  level  of  the  tree  from  part  of  the  interpretation. 

Given $s$ data fragments, any leaf of the tree specifies an interpretation

$$\{(d_1, m_{j_1}), (d_2, m_{j_2}), \ldots, (d_s, m_{j_s})\},$$

where some of the $m_{j_i}$ may be the wild card character. By excluding such matches,
the leaf yields a partial interpretation

$$\{(d_{i_1}, m_{j_{i_1}}), (d_{i_2}, m_{j_{i_2}}), \ldots, (d_{i_k}, m_{j_{i_k}})\}$$

where $1 \le i_1 < i_2 < \ldots < i_k$ but these indices may not include the entire set from 1
to $s$. This interpretation may then be used to solve for a rigid, scaled transformation
that maps model faces into corresponding data fragments, if such a transformation
exists. Thus, by searching for leaves of the tree and testing that the interpretation
there yields a legal transformation, we can find possible instances of object models
in the data. The process is shown in Figure 2.
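
The structure of the unconstrained tree can be illustrated with a short sketch. It enumerates every leaf for a given number of data fragments and model faces, and strips the wild card entries to obtain the partial interpretation. The names `WILD`, `leaves` and `partial_interpretation`, and the use of 0-based indices, are illustrative assumptions rather than part of the RAF system.

```python
from itertools import product

WILD = None  # stands in for the wild card (null) branch

def leaves(num_data, num_faces):
    """Yield every leaf of the unconstrained tree as a tuple of face indices or WILD."""
    choices = list(range(num_faces)) + [WILD]
    # Each of the num_data levels branches in num_faces + 1 ways.
    return product(choices, repeat=num_data)

def partial_interpretation(leaf):
    """Drop wild card entries, keeping (data index, face index) pairs."""
    return [(i, m) for i, m in enumerate(leaf) if m is not WILD]
```

With 3 data fragments and 2 model faces there are already 27 leaves, which is why the constraints discussed in the text are needed to prune the tree.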

Since this search process is inherently an exponential problem, the key to an
efficient solution is to use constraints to remove large subtrees from consideration
without having to explore them explicitly. We next consider the explicit form of the
sensory data, and the explicit form of the model faces, and then consider constraints
on the assignment of one to the other that can be used to restrict the search process.

Figure 2. An Interpretation Tree. Each node of the tree defines a partial interpretation,
where the level of each ancestor defines a sensory data point, and the branch leading to each
such node defines the corresponding model face. An example of a partial interpretation is
shown, where $d_i$ denotes the ith data point and $m_k$ denotes the kth model face.

2.2  Object  models  and  sensory  data 

In  this  note,  we  restrict  ourselves  to  two-dimensional,  or  laminar,  objects,  although 
much of the work has been extended to three dimensions [Grimson and Lozano-Perez
84, 87]. We allow our two-dimensional object models to consist of two different types
of  components. 

The first type of component is a linear edge fragment, consisting of two
endpoints, and a unit vector normal to the line between them, and pointing away
from the interior of the object. Formally, this is given by

$$\text{linear}_i = (n_i, (b_i, e_i)).$$

Note that a point on the edge can be represented by

$$n_i \quad\text{and}\quad b_i + \alpha_i t_i, \qquad \alpha_i \in [0, \ell_i]$$

where $n_i$ is the unit normal vector, $t_i$ is a unit tangent vector, oriented so that it
points from $b_i$ to $e_i$, and $\alpha_i$ can vary from 0 to the length $\ell_i$ of the edge (see Figure
3).




Figure 3. The representation of an edge. An edge is given by the pair

$$n_i \quad\text{and}\quad b_i + \alpha_i t_i, \qquad \alpha_i \in [0, \ell_i]$$

where $n_i$ is a unit normal vector, $t_i$ is a unit tangent vector, oriented so that it points from
$b_i$ to $e_i$, $b_i$ is a vector to the base point of the edge, and $\alpha_i$ can vary from 0 to the length
of the edge $\ell_i$.

The second type of component is a circular arc, consisting of a center, a radius, a
pointing direction, and a range of angles, measured relative to the x axis. Formally,
this is given by

$$\text{circular}_i = (c_i, r_i, d_i, (\phi_i, \psi_i)).$$

The  pointing  direction,  if  known,  is  an  indication  of  which  side  of  the  circular  arc 
is  the  interior  of  the  object.  Specifically,  it  indicates  whether  the  circular  arc  is  the 
boundary  of  a  hole,  or  if  it  is  on  the  exterior  of  the  object.  An  example  of  a  circular 
segment  is  shown  in  Figure  4. 

We  assume  that  both  the  sensory  data  fragments  and  the  object  models  are 
composed  of  such  components.  We  will  discuss  shortly  how  to  obtain  such  fragments 
from  grey  level  images. 

2.3  Constraints  between  object  models  and  sensory  data 

Given such simple fragments, we now consider how to use them to reduce the search
process. We consider both unary and binary constraints. Since the transformation
from model to sensor coordinates is one of the things to be determined, we need
constraints that can compare coordinate frame independent measurements from
sensory and model fragments.

Figure 4. The representation of a circular arc. An arc is defined by a center $c_i$, a radius
$r_i$, a pointing direction $d_i$, and a range of angles $(\phi_i, \psi_i)$.


2.3.1 Unary constraints


Length  constraint 


Consider a linear data fragment,

$$\text{linear}_i = (n_i, (b_i, e_i))$$

and a possible matching model fragment

$$\text{LINEAR}_p = (N_p, (B_p, E_p)).$$

We let $\ell_i$ denote the length of the data fragment, and $L_p$ denote the corresponding
length of the model fragment, where these lengths are given by

$$\ell_i = |b_i - e_i|, \qquad L_p = |B_p - E_p|.$$

We let

$$\text{length-constraint}(i, p) = \text{True iff } \ell_i \le L_p + \epsilon_\ell$$

capture the notion of the unary length constraint, where $\epsilon_\ell$ is a predefined upper
bound on the amount of error inherent in measuring the length of an edge.

This constraint says that if the length of the ith linear data fragment is less
than the length of the pth linear model fragment, subject to some bounded error,
then it is possible to consistently assign this data fragment to lie on this model one.
Note that there is only an upper inequality in determining consistency, since a data
fragment could be partially occluded.

Radius,  swept  angle  and  pointing  constraints 

Now consider a circular data fragment,

$$\text{circular}_i = (c_i, r_i, d_i, (\phi_i, \psi_i))$$

and a possible matching model fragment

$$\text{CIRCULAR}_p = (C_p, R_p, D_p, (\Phi_p, \Psi_p)).$$

We can define three unary constraints in this case. We let

$$\text{radius-constraint}(i, p) = \text{True iff } |r_i - R_p| \le \epsilon_r$$
$$\text{swept-angle-constraint}(i, p) = \text{True iff } \psi_i - \phi_i \le \Psi_p - \Phi_p + \epsilon_c$$
$$\text{pointing-constraint}(i, p) = \text{True iff } d_i \text{ and } D_p \text{ are known and identical.}$$

Here, $\epsilon_r$ is a predefined upper bound on the amount of error inherent in measuring
the radius of a circular arc, and $\epsilon_c$ is an upper bound on the amount of error inherent
in measuring the angular extent of a circular arc.

These  constraints  say  that  if  the  radii  of  the  ith  data  fragment  and  the  pth  model 
agree  to  within  some  error,  their  ranges  of  swept  angles  agree  to  within  some  error, 
and  they  are  both  either  interior  or  exterior  arcs,  then  it  is  possible  to  consistently 
assign  this  data  fragment  to  lie  on  this  model  one. 
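
The unary tests above are simple enough to state directly as predicates. The sketch below assumes the one-sided readings of the length and swept-angle tests (a data fragment may be shorter than its model fragment because of occlusion), assumes angles are given unwrapped with `psi >= phi`, and uses `None` to stand for an unknown pointing direction. All function and parameter names are hypothetical.

```python
def length_constraint(data_len, model_len, eps_len):
    """A data edge may be shorter than the model edge, but not longer (up to error)."""
    return data_len <= model_len + eps_len

def radius_constraint(r, R, eps_r):
    """Data and model radii must agree to within eps_r."""
    return abs(r - R) <= eps_r

def swept_angle_constraint(phi, psi, Phi, Psi, eps_c):
    """The data arc may sweep less than the model arc, but not more (up to error)."""
    return (psi - phi) <= (Psi - Phi) + eps_c

def pointing_constraint(d, D):
    """Both pointing directions must be known and identical."""
    return d is not None and D is not None and d == D
```

These predicates only compare scalar measurements, so they can be evaluated before any coordinate frame transformation is known.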

2.3.2  Binary  constraints 

Consider two linear data fragments,

$$\text{linear}_i = (n_i, (b_i, e_i)) \qquad \text{linear}_j = (n_j, (b_j, e_j))$$

and two possible matching model fragments

$$\text{LINEAR}_p = (N_p, (B_p, E_p)) \qquad \text{LINEAR}_q = (N_q, (B_q, E_q)).$$

We  need  to  derive  a  set  of  constraints  that  will  determine  the  consistency  of  assigning 
the  data  fragments  to  lie  on  the  model  ones. 

Angle  constraint 

Let $\theta_{ij}$ denote the angle between $n_i$ and $n_j$, and let $\Theta_{pq}$ denote the angle between
$N_p$ and $N_q$. We let

$$\text{binary-angle-constraint}(i, j, p, q) = \text{True iff } \theta_{ij} \in [\Theta_{pq} - 2\epsilon_a,\ \Theta_{pq} + 2\epsilon_a]$$

where all arithmetic comparisons are performed modulo $2\pi$ and where $\epsilon_a$ is an upper
bound on the amount of error inherent in determining the direction of a normal.

This  says  that  if  the  angle  between  the  data  normals  agrees  with  the  angle 
between  the  model  normals,  within  some  error,  then  it  is  possible  to  consistently 
assign  these  data  fragments  to  be  on  these  model  ones. 
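
The modulo $2\pi$ comparison is the part of this test that is easiest to get wrong, so a small sketch may help. It assumes the angle between two normals is taken as a signed angle and that each normal carries an error of at most `eps_a`, giving a tolerance of `2 * eps_a` on the difference. The helper names are illustrative only.

```python
import math

def angle_between(n_a, n_b):
    """Signed angle from unit vector n_a to unit vector n_b, in [-pi, pi]."""
    cross = n_a[0] * n_b[1] - n_a[1] * n_b[0]
    dot = n_a[0] * n_b[0] + n_a[1] * n_b[1]
    return math.atan2(cross, dot)

def angular_difference(a, b):
    """Smallest absolute difference between two angles, taken modulo 2*pi."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def binary_angle_constraint(n_i, n_j, N_p, N_q, eps_a):
    """Data and model normal pairs must subtend the same angle to within 2*eps_a."""
    theta_ij = angle_between(n_i, n_j)
    Theta_pq = angle_between(N_p, N_q)
    return angular_difference(theta_ij, Theta_pq) <= 2 * eps_a
```

Because only the relative angle between two normals is compared, the test is unchanged if the whole data set is rotated.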


Distance  constraint 


Given two data fragments, there is a range of distances associated with the family of
vectors having tail on one edge and head on the other. We can compute the range of
such distances, denoted by $[d_{\ell,ij}, d_{h,ij}]$, in a straightforward manner. If $i = j$, then
the minimum distance is 0 and the maximum distance is the length of the edge. In
the more general case, let $\rho(v, u)$ denote the distance between two points. Then the
maximum distance is given by

$$d_{h,ij} = \max\{\rho(v, u) \mid v \in \{b_i, e_i\},\ u \in \{b_j, e_j\}\}.$$

For the minimum distance, we must also consider the possibility that the projection
from an endpoint of one edge in the direction of the normal of the second edge
intersects that edge, so that

$$\begin{aligned}
d_{\ell,ij} = \min\big(&\{\rho(v, u) \mid v \in \{b_i, e_i\},\ u \in \{b_j, e_j\}\} \\
\cup\ &\{\rho(v, b_i + \langle v - b_i, t_i\rangle t_i) \mid v \in \{b_j, e_j\},\ \langle v - b_i, t_i\rangle \in [0, \ell_i]\} \\
\cup\ &\{\rho(v, b_j + \langle v - b_j, t_j\rangle t_j) \mid v \in \{b_i, e_i\},\ \langle v - b_j, t_j\rangle \in [0, \ell_j]\}\big)
\end{aligned}$$

where we let $\langle \cdot, \cdot \rangle$ denote the dot (or inner) product of two vectors. For the model
fragments, we can compute similar ranges, which we denote by $[D_{\ell,pq}, D_{h,pq}]$. We let

$$\text{distance-constraint}(i, j, p, q) = \text{True iff } [d_{\ell,ij}, d_{h,ij}] \subseteq [D_{\ell,pq} - 2\epsilon_p,\ D_{h,pq} + 2\epsilon_p]$$

where we assume that the position of an edge point is known to within an error
bound $\epsilon_p$.

This  says  that  if  the  range  of  distances  between  the  data  edges  is  contained 
within  the  range  of  distances  between  the  model  edges,  subject  to  some  error,  then 
it  is  possible  to  consistently  assign  these  data  fragments  to  lie  on  these  model  ones. 
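
The distance range can be computed from endpoint distances plus the perpendicular projections that land inside an edge. The following sketch follows that recipe for two distinct, non-degenerate edges given as `(base, end)` point pairs; like the formula in the text, it does not treat crossing edges specially. The function names and the tuple representation are assumptions for illustration.

```python
import math

def dist(v, u):
    return math.hypot(v[0] - u[0], v[1] - u[1])

def point_to_segment(v, b, e):
    """Distance from v to its projection on segment b-e, or None if it falls outside."""
    length = dist(b, e)
    t = ((e[0] - b[0]) / length, (e[1] - b[1]) / length)
    a = (v[0] - b[0]) * t[0] + (v[1] - b[1]) * t[1]
    if 0.0 <= a <= length:
        return dist(v, (b[0] + a * t[0], b[1] + a * t[1]))
    return None

def distance_range(edge_i, edge_j):
    """Return (low, high) distances between points of two edges."""
    (bi, ei), (bj, ej) = edge_i, edge_j
    endpoint = [dist(v, u) for v in (bi, ei) for u in (bj, ej)]
    proj = [point_to_segment(v, bi, ei) for v in (bj, ej)]
    proj += [point_to_segment(v, bj, ej) for v in (bi, ei)]
    low = min(endpoint + [p for p in proj if p is not None])
    return low, max(endpoint)

def distance_constraint(data_range, model_range, eps_p):
    """Data range must lie inside the model range widened by 2*eps_p at each end."""
    return (model_range[0] - 2 * eps_p <= data_range[0]
            and data_range[1] <= model_range[1] + 2 * eps_p)
```

For the edges `((0, 0), (4, 0))` and `((1, 2), (3, 2))` the minimum comes from a perpendicular projection rather than from any pair of endpoints.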


Component  constraint 


The third constraint concerns the separation of the two edge fragments. In particular,
we consider the range of components of a vector between the two edge fragments,
in the direction of each of the edge normals. Algebraically, this is expressed by the
dot product

$$\langle b_i + \alpha_i t_i - b_j - \alpha_j t_j,\ n_i \rangle$$

which reduces to

$$\langle b_i - b_j, n_i \rangle - \alpha_j \langle t_j, n_i \rangle, \qquad \alpha_j \in [0, \ell_j].$$

Of course, there is an equivalent constraint for components in the direction of $n_j$.
Note that this expression actually determines a range of values, with extrema when
$\alpha_j = 0, \ell_j$. We denote this by

$$d^{\perp}_{\ell,ij} = \min\{\langle b_i - b_j, n_i \rangle - \alpha_j \langle t_j, n_i \rangle \mid \alpha_j \in \{0, \ell_j\}\}$$
$$d^{\perp}_{h,ij} = \max\{\langle b_i - b_j, n_i \rangle - \alpha_j \langle t_j, n_i \rangle \mid \alpha_j \in \{0, \ell_j\}\}$$

These ranges can be computed both for pairs of data edges and pairs of model
edges. In the ideal case, consistency will hold only if the data range is contained
within the model range (since the data edges may correspond to only parts of the
model edges). As in the case of the other constraints, we also need to account for
error in the measurements. We derive a simple method for doing this below.
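
For the ideal, error-free case the component range needs only the two endpoints of the second edge, since the component along $n_i$ does not change as one moves along edge $i$. The sketch below computes that range and the containment test; it deliberately omits the error correction derived next, and its names are hypothetical.

```python
def component_range(edge_i, n_i, edge_j):
    """Range of <p_i - p_j, n_i> for p_i on edge_i and p_j on edge_j (no error bounds).

    edge_i and edge_j are (base, end) point pairs; n_i is the unit normal of edge_i.
    """
    b_i = edge_i[0]
    # The extrema occur at the two endpoints of edge_j.
    values = [(b_i[0] - u[0]) * n_i[0] + (b_i[1] - u[1]) * n_i[1] for u in edge_j]
    return min(values), max(values)

def range_contained(data_range, model_range):
    """True if the data range lies inside the model range."""
    return model_range[0] <= data_range[0] and data_range[1] <= model_range[1]
```

Swapping the roles of the two edges gives the equivalent range in the direction of the other normal.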


Figure 5. Errors in computing the direction constraint. (a) The component of a vector
from one endpoint in the direction of the other edge's normal is given by the perpendicular
distance $D^{\perp}$ to the extended edge. (b) Since the actual normal is only accurate to within
$\epsilon_a$, one extreme case is given by rotating the extended edge about its midpoint by that
amount and finding the new perpendicular distance. (c) The other extreme is obtained by
considering the other endpoint.

Consider the base case, shown in Figure 5a. The perpendicular distance from the
endpoint of one edge to the other edge is shown as $D^{\perp}$. In Figure 5b, the edge is
rotated by $\epsilon_a$ about its midpoint, and the new perpendicular distance $X$ is shown.
We need to relate $X$ to measurable values. We already have $D^{\perp}$. We can also
measure $S$, the distance from the midpoint of the edge to the perpendicular dropped
from the endpoint of the other edge, as shown. Straightforward trigonometry then
yields the new distance

$$X = (D^{\perp} - S \sin\epsilon_a) \cos\epsilon_a.$$

Since the position of the second edge is not known exactly, we must adjust this
expression, to yield one limit on the range of possible measurements:

$$D_{\ell,pq} = (D^{\perp} - S \sin\epsilon_a) \cos\epsilon_a - \epsilon_p.$$

The other extreme is shown in Figure 5c. Trigonometric manipulation yields
the following upper bound

$$D_{h,pq} = (S - D^{\perp} \sin\epsilon_a) \sin\epsilon_a + D^{\perp} \sec\epsilon_a + \epsilon_p.$$

Thus, given two model edges indexed by $p, q$, we can compute a range of possible
measurements (modulo known error bounds), by using $D_{\ell,pq}$ and $D_{h,pq}$ computed
over all the endpoints of the edges. We denote this range by $[D^{\perp}_{\ell,pq}, D^{\perp}_{h,pq}]$.

We let

$$\text{component-constraint}(i, j, p, q) = \text{True iff } [d^{\perp}_{\ell,ij}, d^{\perp}_{h,ij}] \subseteq [D^{\perp}_{\ell,pq}, D^{\perp}_{h,pq}].$$

This says that if the range of distance components between the data edges is contained
within the corresponding range between the model edges, subject to some
error,  then  it  is  possible  to  consistently  assign  these  data  fragments  to  lie  on  these 
model  ones. 

Circle  center  constraint 

Now consider two circular data fragments,

$$\text{circular}_i = (c_i, r_i, d_i, (\phi_i, \psi_i)) \qquad \text{circular}_j = (c_j, r_j, d_j, (\phi_j, \psi_j))$$

and two possible matching model fragments

$$\text{CIRCULAR}_p = (C_p, R_p, D_p, (\Phi_p, \Psi_p)) \qquad \text{CIRCULAR}_q = (C_q, R_q, D_q, (\Phi_q, \Psi_q)).$$

We  need  to  derive  a  set  of  constraints  that  will  determine  the  consistency  of  assigning 
the  data  fragments  to  lie  on  the  model  ones. 

The  first  constraint  arises  from  considering  the  distance  between  the  centers  of 
the  two  circles,  given  by  p( c,,c, ),  in  the  case  of  the  two  data  fragments.  We  let 

$$\text{center-constraint}(i, j, p, q) = \text{True iff } \rho(c_i, c_j) \in [\rho(C_p, C_q) - 2\epsilon_{cd},\ \rho(C_p, C_q) + 2\epsilon_{cd}]$$

where $\epsilon_{cd}$ is an upper bound on the amount of error inherent in measuring the
position of the center of a circular arc.

This says that if the distance between the centers of two data arcs is within
some bounded error of the distance between the centers of the model arcs, then it
is possible to consistently assign these data fragments to lie on these model ones.
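
A minimal sketch of the center test, assuming centers are given as coordinate pairs and that each center carries an error of at most `eps_cd`; the function name is illustrative.

```python
import math

def center_constraint(c_i, c_j, C_p, C_q, eps_cd):
    """Data and model center separations must agree to within 2*eps_cd."""
    d_data = math.dist(c_i, c_j)
    d_model = math.dist(C_p, C_q)
    return abs(d_data - d_model) <= 2 * eps_cd
```

Like the other binary tests, this compares a distance that is unaffected by rotation and translation of the data.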

Circle  swept  angle  constraint 

The ranges of swept angles of two fragments must also be constrained. We let

$$\text{binary-swept-angle-constraint}(i, j, p, q) = \text{True iff } \phi_i - \psi_j \ge \Phi_p - \Psi_q - 2\epsilon_c$$
$$\text{and } \psi_i - \phi_j \le \Psi_p - \Phi_q + 2\epsilon_c$$

where as before $\epsilon_c$ is the amount of error inherent in measuring the angular extent
of a circular arc, and the angles are measured modulo $2\pi$.
This says that if the difference in the range of swept angles between two data
arcs is contained within the corresponding range of two model arcs, subject to some
error, then it is possible to consistently assign these data fragments to lie on these
model ones.


2.3.3  Cross  type  constraints 

All  of  the  binary  constraints  described  above  deal  with  the  relationship  between  pairs 
of  data  fragments  of  the  same  type,  and  corresponding  pairs  of  model  fragments. 
Note  that  there  are  other  constraints  possible,  especially  cross  constraints  between 
fragments  of  opposite  type. 


Cross  distance  constraint 

Consider a circular data fragment

$$\text{circular}_i = (c_i, r_i, d_i, (\phi_i, \psi_i))$$

and a linear data fragment

$$\text{linear}_j = (n_j, (b_j, e_j))$$

and corresponding model fragments

$$\text{CIRCULAR}_p = (C_p, R_p, D_p, (\Phi_p, \Psi_p)) \qquad \text{LINEAR}_q = (N_q, (B_q, E_q)).$$

The  range  of  distances  between  the  center  of  a  circular  fragment  and  a  linear 
fragment  can  be  constrained,  similar  to  the  distance  constraint  between  two  linear 
fragments.  We  let 

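$$F_{\ell,pq} = \begin{cases}
\rho(C_p, B_q + \langle C_p - B_q, T_q \rangle T_q) & \text{if } \langle C_p - B_q, T_q \rangle \in [0, L_q] \\
\min\{\rho(C_p, B_q), \rho(C_p, E_q)\} & \text{otherwise}
\end{cases}$$
$$F_{h,pq} = \max\{\rho(C_p, B_q), \rho(C_p, E_q)\}$$

and we let

$$\text{cross-distance}(i, j, p, q) = \text{True iff } \rho(c_i, b_j) \in [F_{\ell,pq} - \epsilon_{cd} - \epsilon_p,\ F_{h,pq} + \epsilon_{cd} + \epsilon_p]$$
$$\text{and } \rho(c_i, e_j) \in [F_{\ell,pq} - \epsilon_{cd} - \epsilon_p,\ F_{h,pq} + \epsilon_{cd} + \epsilon_p].$$

This says that if the range of distances from the center of the data circle to the
data edge is contained within the corresponding model range, subject to error, then
it is possible to consistently assign these data fragments to lie on these model ones.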
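
The cross distance test combines a point-to-segment range on the model side with two endpoint distances on the data side. The sketch below assumes points are coordinate pairs and that the model edge has nonzero length; the function names are hypothetical.

```python
import math

def center_to_edge_range(C, B, E):
    """Range (F_low, F_high) of distances from point C to points of segment B-E."""
    L = math.dist(B, E)
    T = ((E[0] - B[0]) / L, (E[1] - B[1]) / L)
    a = (C[0] - B[0]) * T[0] + (C[1] - B[1]) * T[1]
    if 0.0 <= a <= L:
        # The perpendicular from C lands on the segment.
        low = math.dist(C, (B[0] + a * T[0], B[1] + a * T[1]))
    else:
        low = min(math.dist(C, B), math.dist(C, E))
    return low, max(math.dist(C, B), math.dist(C, E))

def cross_distance(c_i, b_j, e_j, model_range, eps_cd, eps_p):
    """Both data endpoint distances must fall in the widened model range."""
    lo = model_range[0] - eps_cd - eps_p
    hi = model_range[1] + eps_cd + eps_p
    return all(lo <= math.dist(c_i, v) <= hi for v in (b_j, e_j))
```

Only the two endpoints of the data edge are tested, which matches the constraint as stated: a partially occluded data edge still has both of its visible endpoints somewhere on the model edge.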

Cross  component  constraint 

Given the same data and model fragments as above, we can consider the perpendicular
distance from the circle center to the extended line defined by the linear
segment.
Using the same method described for the component constraint between two
linear fragments, we let

$$D_{\ell,pq} = (D^{\perp} - S \sin\epsilon_a) \cos\epsilon_a - \epsilon_p$$
$$D_{h,pq} = (S - D^{\perp} \sin\epsilon_a) \sin\epsilon_a + D^{\perp} \sec\epsilon_a + \epsilon_p$$

where in this case

$$D^{\perp} = \langle B_q - C_p, N_q \rangle.$$

We let

$$\text{cross-component}(i, j, p, q) = \text{True iff } \langle b_j - c_i, n_j \rangle \in [D_{\ell,pq}, D_{h,pq}].$$

This  says  that  if  the  component  of  distance  from  the  center  of  the  data  circle  to 
the  data  edge  is  contained  within  the  corresponding  model  range,  subject  to  error, 
then  it  is  possible  to  consistently  assign  these  data  fragments  to  lie  on  these  model 
ones. 

Cross  angle  constraint 

Let  be  the  angle  between  the  unit  normal  vector  of  the  linear  edge  and  the 

x  axis,  for  the  data  and  model  linear  fragments  respectively.  Then  the  range  of 
angles  between  the  swept  angles  of  the  circular  fragment  and  the  unit  normal  must 
be  consistent. 

We  let 

cross-angle(t, j>,p, q)  -  True  iff  \4>t  -1,\Z  <t>,  J  frj. 

2.4  The  constraints  reduce  the  search 

Given  these  unary  and  binary  constraints,  the  constrained  search  process  can  be 
straightforwardly  specified.  Suppose  the  search  process  is  currently  at  some  node  at 
level  k  in  the  interpretation  tree  and  with  a  consistent  partial  interpretation  given 
by

$$\{(d_1, m_{j_1}), (d_2, m_{j_2}), \ldots, (d_k, m_{j_k})\}.$$

We now consider the next data fragment $d_{k+1}$, and its possible assignment to model
face $m_{j_{k+1}}$, where $j_{k+1}$ varies from 1 to $n + 1$.

The  following  rules  hold. 

• If $m_{j_{k+1}}$ is the wild card match, then the new interpretation

    $$\{(d_1, m_{j_1}), (d_2, m_{j_2}), \ldots, (d_{k+1}, m_{j_{k+1}})\}$$

  is consistent, and we continue downward in our search.

• If $m_{j_{k+1}}$ is a linear edge segment, we must verify that

    length-constraint($k+1$, $j_{k+1}$) = True.

  Moreover, for all $i \in \{1, \ldots, k\}$ such that $d_i$ is a linear edge fragment we must
  verify that

    binary-angle-constraint($i$, $k+1$, $j_i$, $j_{k+1}$) = True
    distance-constraint($i$, $k+1$, $j_i$, $j_{k+1}$) = True
    component-constraint($i$, $k+1$, $j_i$, $j_{k+1}$) = True
    component-constraint($k+1$, $i$, $j_{k+1}$, $j_i$) = True.

  And, for all $i \in \{1, \ldots, k\}$ such that $d_i$ is a circular edge fragment, we must
  verify that

    cross-distance($i$, $k+1$, $j_i$, $j_{k+1}$) = True
    cross-component($i$, $k+1$, $j_i$, $j_{k+1}$) = True
    cross-angle($i$, $k+1$, $j_i$, $j_{k+1}$) = True.

• If $m_{j_{k+1}}$ is a circular arc segment, we must verify that

    radius-constraint($k+1$, $j_{k+1}$) = True
    pointing-constraint($k+1$, $j_{k+1}$) = True
    swept-angle-constraint($k+1$, $j_{k+1}$) = True.

  Moreover, for all $i \in \{1, \ldots, k\}$ such that $d_i$ is a circular arc fragment, we must
  verify that

    binary-swept-angle-constraint($i$, $k+1$, $j_i$, $j_{k+1}$) = True
    binary-swept-angle-constraint($k+1$, $i$, $j_{k+1}$, $j_i$) = True
    center-constraint($i$, $k+1$, $j_i$, $j_{k+1}$) = True.

  And, for all $i \in \{1, \ldots, k\}$ such that $d_i$ is a linear edge fragment, we must verify
  that

    cross-distance($k+1$, $i$, $j_{k+1}$, $j_i$) = True
    cross-component($k+1$, $i$, $j_{k+1}$, $j_i$) = True
    cross-angle($k+1$, $i$, $j_{k+1}$, $j_i$) = True.

• If all of these constraints are true, then

    $$\{(d_1, m_{j_1}), (d_2, m_{j_2}), \ldots, (d_{k+1}, m_{j_{k+1}})\}$$

  is a consistent partial interpretation, and we continue our depth first search. If
  one of them is false, then the partial interpretation is inconsistent. In this case,
  we increment the model face index $j_{k+1}$ by 1 and try again, until $j_{k+1} = n + 1$.

If the search process is currently at some node at level $k$ in the interpretation tree,
and has an inconsistent partial interpretation given by

$$\{(d_1, m_{j_1}), (d_2, m_{j_2}), \ldots, (d_k, m_{j_k})\}$$

then it is in the process of backtracking. If $j_k = n + 1$ (the wild card) we backtrack
up another level, otherwise we increment $j_k$ and continue.
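
The rules above amount to a depth first search in which each new pairing is checked against every earlier non wild card pairing. The sketch below expresses that search recursively rather than with explicit index incrementing and backtracking, and it abstracts the individual tests into two caller-supplied predicates, `unary_ok(i, p)` and `binary_ok(i, j, p, q)`. Those predicate names, the 0-based indices, and the `WILD` marker are assumptions for illustration; a real implementation would dispatch on fragment type as the rules describe.

```python
WILD = None  # the wild card (null) branch

def search(num_data, num_faces, unary_ok, binary_ok):
    """Depth first search of the interpretation tree with constraint pruning.

    Returns every consistent leaf as a list of face indices (or WILD),
    one entry per data fragment.
    """
    results = []

    def extend(assign):
        k = len(assign)  # index of the next data fragment to assign
        if k == num_data:
            results.append(list(assign))
            return
        for p in range(num_faces):
            if not unary_ok(k, p):
                continue  # prune: unary constraint fails
            # Check the new pairing against every earlier real pairing.
            if all(q is WILD or binary_ok(i, k, q, p)
                   for i, q in enumerate(assign)):
                assign.append(p)
                extend(assign)
                assign.pop()
        # The wild card branch is always consistent.
        assign.append(WILD)
        extend(assign)
        assign.pop()

    extend([])
    return results
```

A failed test cuts off the entire subtree below that node, which is where the savings over exhaustive enumeration come from.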


2.5  Model  tests 


Once  the  search  process  reaches  a  leaf  of  the  interpretation  tree,  we  have  accounted 
for  all  of  the  data  points.  We  are  now  ready  to  determine  if  the  interpretation  is  in 
fact  globally  valid.  To  do  this,  we  solve  for  a  rigid  transformation  mapping  points 
$v_m$ in model coordinates into points $v_d$ in sensor coordinates,

$$v_d = s R v_m + v_0$$

where $R$ is a rotation matrix, $v_0$ is a translation vector, and $s$ is a scale factor. We
can solve for this transformation in a number of ways (e.g. [Grimson and Lozano-Perez
84, 87], [Ayache and Faugeras 86]). The method described in [Ayache and
Faugeras 86] deals with finding transformations from line segments to line segments.
It  is  straightforward  to  extend  the  method  to  deal  with  transformations  of  sets  of 
points  (the  centers  of  the  circular  arcs)  to  sets  of  points.  In  our  implementation 
(described  later)  we  solve  for  two  transformations,  one  based  on  the  linear  fragments 
of  the  match,  and  one  based  on  the  circular  fragments.  We  then  require  that  the 
two  transformations  be  roughly  identical. 
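
For the point-to-point case (for example, matched circle centers), one standard closed-form least squares fit treats 2D points as complex numbers, so that a scaled rotation is a single complex multiplier. The sketch below shows that technique; it is a generic textbook fit offered as an illustration, not the method of [Ayache and Faugeras 86] or the implementation described here, and it assumes at least two distinct model points.

```python
import cmath

def fit_similarity(model_pts, data_pts):
    """Least squares fit of v_d = s * R(theta) * v_m + v0 from matched 2D points.

    Returns (s, theta, v0). Points are (x, y) pairs in matching order.
    """
    zm = [complex(*p) for p in model_pts]
    zd = [complex(*p) for p in data_pts]
    mean_m = sum(zm) / len(zm)
    mean_d = sum(zd) / len(zd)
    # With a = s * exp(i*theta), minimize sum |a*(m - mean_m) - (d - mean_d)|^2.
    num = sum((d - mean_d) * (m - mean_m).conjugate() for m, d in zip(zm, zd))
    den = sum(abs(m - mean_m) ** 2 for m in zm)
    a = num / den
    b = mean_d - a * mean_m  # translation as a complex number
    return abs(a), cmath.phase(a), (b.real, b.imag)
```

Fitting the linear and circular fragments separately and comparing the two results, as the text describes, would then amount to comparing the returned scale, angle and translation within some tolerance.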

Given such a transformation, which is usually some type of least squares fit,
we must then ensure that the interpretation actually satisfies it. We do this by
considering  each  of  the  linear  data  fragments  associated  with  a  real  model  face  in  the 
interpretation,  and  transforming  the  associated  linear  model  face  by  the  computed 
transform.  For  each  such  face,  we  then  verify  that  the  transformed  fragment  differs 
in  position  and  orientation  from  its  associated  data  fragment  by  amounts  that  are 
less  than  some  acceptable  error  bounds.  These  bounds  on  transform  error  can  be 
obtained from the predefined bounds on the sensor error [Grimson 86b]. For each of
the circular data fragments associated with a real model face in the interpretation,
we also transform the model fragment into sensor coordinates. In this case, we both
verify that the transformed center of the circular arc lies within some bounded error
of the center of the associated data fragment, and that the set of actual points lying
on the data circular arc are within a bounded distance of some point among the
set of transformed model points. Any interpretation that passes such a model test
is a consistent interpretation of the data.

2.6  Additional  search  reductions 

While  the  constrained  search  technique  described  above  will  succeed  in  finding  all 
consistent interpretations of the sensory data, for a given object model, it is not particularly
computationally efficient. This is mostly due to the problem of segmenting
the data to determine subsets that belong to a single object. Indeed, for the case
of linear fragments only, if all of the sensory data do belong to one object, the described
method is known to be quite efficient, as has been verified both empirically
[Grimson and Lozano-Perez 84, 87] and theoretically [Grimson 1986a]. In order to
improve the efficiency of the method, we add two additional methods to our search
process, both previously discussed for the case of linear fragments in [Grimson and
Lozano-Perez 87], and extended here to circular segments.

Hough  transforms 

The first is to use the Hough transform [Hough 62, Merlin and Farber 75, Sklansky
78, Ballard 81] as a preprocessor to restrict our attention to small portions of the
search space. In brief, the Hough transform works as follows. Consider a three
dimensional configuration space, with axes denoting the x and y components of a
translation vector, along with a $\theta$ axis which denotes an angle of rotation. Thus,
any point in this space defines a unique rigid transformation. We tessellate the space
into buckets, based on some sampling of the axes, say $h_x, h_y, h_\theta$.

Now, we consider a linear data edge, indexed by $i$, and a linear model fragment, indexed by
$q$. Suppose that the data edge is shorter than the model edge, subject to the error
in measuring length $\epsilon_\ell$. There is a unique angle, call it $\theta'$, needed in order to rotate
the model fragment so that its normal aligns with the data fragment's normal. Let
$R_{\theta'}$ denote the rotation matrix associated with this rotation angle. Ignoring for the
moment  the  effects  of  scaling,  any  point  on  the  model  edge  can  be  transformed  into 
sensor  coordinates,  as 


\[ u = R_{\theta'} \left[ B_q + \beta T_q \right] + v_0 . \]


Any translation v_0 such that the endpoints of the transformed line lie within range
of the associated data line is a possible valid translation. Since the data edge may
be partially occluded, it will in general be shorter than the model edge and hence
there will be a range of translations possible, corresponding to sliding the shorter
edge along the longer one.

These  translations  are  given  by  the  set  of  v0  that  satisfy 

(u  -  b,,n,)  e 

(u  -  b,,t.)  €  f-e +  (L] 

for  all  3  €  '0,  Lq  .  If  we  let  v0  -  cn{R n,)  *  c,(Rt,)  then  the  range  of  possible 
translations  is  given  by 


B^.  Rn,)  ■  (b,,n,)  -  •  max  |(),  Lq  ^T,,  Rh,^  j 


(B,.  Rn ,  -  b,,n.)  - 


By.  Rt,)  (b,,t,}  -  ix  (L  4  max 


<B,.  Rt , 


in  {o,L,(f. ,./»,)}] 

L  4  max  J'  RilSJ  J  , 

in  {0.L,(f 


Thus, for the given rotation θ', these expressions define a polygon in the translation
subspace of the Hough space. Any bucket in the tessellated Hough space that intersects
this polygon denotes a possible transformation consistent with the given pairing of
data and model fragment. Thus, we place the pair (linear_i, LINEAR_q) into each
such bucket. This computation was done assuming a rotation θ' based on aligning
the data normal and the model normal. Since, in general, there may be error in the
data normal, we repeat the above process for a sampling of angles, chosen from the
range

\[ [\theta' - \epsilon_a,\; \theta' + \epsilon_a] . \]

If the data edge is longer than the model edge, subject to the error in measuring
length, then nothing is done.
A similar computation holds for the pairing of a circular data arc, i, and a
circular model arc, q. Suppose that the radius of the data arc agrees with the radius
of the model arc, subject to the error in measuring such radii. Then, given a rotation
angle θ, the condition on the translation part of the transformation is simply given
by

\[ \langle R C_q + v_0 - c_i,\; R C_q + v_0 - c_i \rangle \le (2\epsilon_d)^2 . \]

In general, as we sweep through all possible rotation angles, the position of this
circle of possible translation vectors will trace out a helix in Hough space. However,
we need only consider the range of angles θ' such that the rotated range of swept
angles for the model arc will lie within the range of swept angles for the data arc,
subject to error in measuring such swept angles. For each such angle, there will be a
set of translations associated with it. Now, as before, for every bucket in the Hough
space that intersects the circle defined by the above condition, we place the pair
(circular_i, CIRCULAR_q) into the bucket. If the radius of the data arc does not agree
with the radius of the model arc, subject to the error in measuring such radii, then
nothing is done.

We  can  repeat  this  process  for  all  possible  pairings  of  data  elements  to  model 
fragments,  adding  pairs  to  appropriate  Hough  buckets.  Each  such  pair  essentially 
votes  for  the  set  of  transformations  with  which  it  may  be  consistent.  Having  done 
this,  we  can  then  rank  the  Hough  buckets.  We  do  this  by  assigning  to  each  bucket 
a  measure,  determined  by  the  sum  of  the  lengths  of  the  linear  data  edges  assigned 
to  the  bucket  plus  the  sum  of  the  arc  lengths  of  the  circular  data  edges  assigned  to 
that  bucket.  This  allows  us  to  sort  the  Hough  buckets,  in  decreasing  order. 
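
The voting and ranking steps can be sketched as follows. This is a minimal illustration, not the implementation described here: the `votes` list is assumed to have been produced already by intersecting the polygons and circles above with the bucket grid, the names `rank_buckets` and `data_length` are hypothetical, and counting each data fragment only once per bucket (even when it is paired with several model fragments) is an assumption.

```python
from collections import defaultdict

def rank_buckets(votes, data_length):
    """Group data-model pairs by Hough bucket and sort buckets by matched data length.

    votes: iterable of (bucket, data_id, model_id) triples.
    data_length: dict mapping data_id to edge length (or arc length for circular data).
    Returns a list of (bucket, measure, pairs), largest measure first.
    """
    buckets = defaultdict(set)
    for bucket, data_id, model_id in votes:
        buckets[bucket].add((data_id, model_id))

    def measure(pairs):
        # Sum the lengths of the distinct data fragments assigned to the bucket.
        return sum(data_length[d] for d in {d for d, _ in pairs})

    ranked = sorted(buckets.items(), key=lambda kv: measure(kv[1]), reverse=True)
    return [(bucket, measure(pairs), sorted(pairs)) for bucket, pairs in ranked]
```

Each returned entry carries the data-model pairs that define the smaller interpretation tree for that bucket.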

Now each bucket defines a new interpretation tree. It contains a number (usually
much less than the total number) of data fragments, and associated with each
one is a set of possible matching model fragments. By adding the wild card character
as before, we can apply our constrained search process to this much smaller
interpretation tree, to obtain consistent interpretations. We can simply search through
the Hough buckets in sorted order until we obtain a valid interpretation.

Note that this process has ignored the effect of scale in the object transformation.
We can incorporate scale in at least two different ways. The first would be
to add an additional dimension to our Hough space, and then to place data-model
pairs in this four dimensional space based on the set of translation, rotation and
scale factors consistent with such a pairing. A second method is to increase the number
of buckets into which a data-model pair is placed by increasing the bounds on
the distance allowed between a Hough bucket and the set of translation and rotation
factors deemed consistent with a pairing. By placing bounds on the range of
possible scale factors, one can determine appropriate bounds on this distance. Note
that such a range of scale factors will only affect the translation components of the
Hough space. In our implementation, we choose the latter approach.

Also note that we need not use a Hough space whose dimensionality matches
the number of degrees of freedom of the object models, since we are not relying on
the Hough transform to directly interpret the data. Rather, since we only use the
Hough transform to reduce the search space, we can use any number of dimensions
in our Hough space, trading off expense of computing the Hough transform against
the gain in reduction of the final search space.


Premature  termination 


We can add a second heuristic to our search method, which also drastically reduces
the effort involved. Suppose we have reached a leaf in one of our interpretation
trees, and the interpretation associated with it is consistent. Since many of the data
fragments in the interpretation are likely to have been assigned the wild card
character, our search method would proceed to backtrack, attempting to find another
interpretation that accounted for more of the data. In many cases, this is a fruitless
task [Grimson and Lozano-Perez 87]. We can truncate this search, at the possible
risk of occasionally misinterpreting the data. In particular, we can apply a measure
of goodness of match to each consistent interpretation. If that measure exceeds
some predefined threshold, then we can accept the interpretation, and terminate the
search in that particular interpretation tree. Reasonable measures of match include
the number of data fragments accounted for, and a measure of the percentage of the
object model accounted for, determined by the ratio of the sum of lengths of the
linear data fragments accounted for plus the sum of the arc lengths of the circular
data fragments accounted for, relative to the overall perimeter of the object model.
In our implementation, we use the perimeter method.
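
The perimeter measure can be sketched as below. The representation is an assumption made for illustration: an interpretation is a list of (data fragment, model fragment) pairs, with `None` standing in for the wild card character, and `perimeter_fraction` is a hypothetical name.

```python
def perimeter_fraction(interpretation, data_length, model_perimeter):
    """Fraction of the model perimeter accounted for by an interpretation.

    interpretation: list of (data_id, model_id) pairs; model_id is None for a wild card.
    data_length: dict mapping data_id to edge length (arc length for circular data).
    """
    matched = sum(data_length[d] for d, m in interpretation if m is not None)
    return matched / model_perimeter

def accept(interpretation, data_length, model_perimeter, threshold):
    """Premature termination test: accept once the measure reaches the threshold."""
    return perimeter_fraction(interpretation, data_length, model_perimeter) >= threshold
```

For example, an interpretation that matches data fragments of length 10 and 30 against a model of perimeter 200 accounts for 0.2 of the perimeter; it would be accepted at a threshold of .10 and rejected at .50.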

These  two  techniques  can  be  combined  to  produce  a  very  efficient  recognition 
system.  We  can  search  through  the  sorted  Hough  buckets,  applying  our  constrained 
search  method  to  the  interpretation  tree  defined  by  the  bucket  contents.  If  we  find 
an  interpretation  that  exceeds  our  predefined  measure  of  match,  we  can  remove  the 
data  fragments  that  have  been  accounted  for,  adjust  our  Hough  buckets  accordingly, 
and  continue  the  process,  until  we  have  either  identified  all  of  the  edges  in  the  data, 
or all of the Hough buckets have been exhausted.
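
The combined control loop might look like the following sketch. The constrained search itself is not reproduced; it is passed in as a callable `search` (a hypothetical interface) that returns an interpretation and its match measure, or `None`. The bucket representation and the name `recognize` are likewise assumptions.

```python
def recognize(buckets, data_length, search, threshold):
    """Search Hough buckets in decreasing order of content, with premature termination.

    buckets: dict mapping bucket index to a set of (data_id, model_id) pairs.
    search: callable taking a set of pairs and returning (interpretation, score) or None;
            interpretation is a list of (data_id, model_id), model_id None for a wild card.
    Returns a list of (bucket, interpretation) for every accepted interpretation.
    """
    buckets = {b: set(pairs) for b, pairs in buckets.items()}   # work on a copy
    found = []

    def measure(pairs):
        return sum(data_length[d] for d in {d for d, _ in pairs})

    while buckets:
        best = max(buckets, key=lambda b: measure(buckets[b]))  # current best bucket
        pairs = buckets.pop(best)
        result = search(pairs)
        if result is None or result[1] < threshold:
            continue                                            # move on to the next bucket
        interpretation, _ = result
        found.append((best, interpretation))
        # Remove the accounted-for data fragments from all remaining buckets.
        used = {d for d, m in interpretation if m is not None}
        for b in list(buckets):
            buckets[b] = {(d, m) for d, m in buckets[b] if d not in used}
            if not buckets[b]:
                del buckets[b]
    return found
```

Selecting the maximum on every pass stands in for re-sorting the bucket list after each adjustment.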

Note that in using a cutoff based on percentage of object accounted for, one
can weight the edges based on relative importance, possibly by using a measure of
saliency [Turney, Mudge and Volz 86].


3.  Getting  the  fragments  from  real  data 


We have assumed that both the object models and the sensory data consist of sets
of edge fragments, both linear and circular, as characterized in Section 2. Given
such assumptions, we have developed a constrained search technique that will find
interpretations of the data relative to the model. We must show, however, that the
assumption on the form of the models and sensory data is valid. To do this, we
describe a method for obtaining linear and circular edge fragments from grey level
images. This will be used both to build the object models automatically, and to
process the input sensory data.

The first stage in our processing is the extraction of sharp intensity changes
in the grey-level input image. There is a large body of literature on the problem
of edge detection, and any of several different edge detectors would suffice for our
purposes. For a variety of reasons, we use a Marr-Hildreth [Marr and Hildreth 80]
Laplacian of Gaussian edge detector. Applying this operator to the image reduces
the sensory input to an array of connected edge points, where a 1 in a pixel indicates
an edge point, and all other points are 0.

Next, we extract connected contours from this array. This can be done by a
simple  tracing  operation.  Note  that  it  is  not  critical  if  missing  edge  points  cause  the 
tracing  operation  to  fragment  the  edge  contours  into  a  set  of  smaller  ones. 

As we extract each edge point of a contour, we record two pieces of information,
an estimate of the local orientation of the edge at that point, and an estimate of the
change in arclength between the previous edge point and the current one. Since these
measurements tend to be noisy, we smooth both of them by recursive averaging.
This yields a transformed representation of the edge contour, now mapped into an
arclength-orientation (θ-s) space [Perkins 78, 80, McKee and Aggarwal 77].
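
A minimal sketch of this mapping is given below. It assumes the contour is a list of (x, y) points, takes the orientation of each step from consecutive points, and uses an exponential running average with weight `alpha` as one possible form of recursive averaging; the function name `theta_s` and the value of `alpha` are assumptions.

```python
import math

def theta_s(points, alpha=0.5):
    """Map a traced contour to smoothed (arclength, orientation) samples."""
    samples = []
    s = 0.0
    prev_raw = None     # last unwrapped raw orientation
    theta_bar = None    # recursively averaged orientation
    ds_bar = None       # recursively averaged arclength step
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        ds = math.hypot(x1 - x0, y1 - y0)
        theta = math.atan2(y1 - y0, x1 - x0)
        if prev_raw is not None:
            # Unwrap so that orientation varies continuously along the contour.
            while theta - prev_raw > math.pi:
                theta -= 2.0 * math.pi
            while theta - prev_raw < -math.pi:
                theta += 2.0 * math.pi
        prev_raw = theta
        theta_bar = theta if theta_bar is None else alpha * theta + (1 - alpha) * theta_bar
        ds_bar = ds if ds_bar is None else alpha * ds + (1 - alpha) * ds_bar
        s += ds_bar
        samples.append((s, theta_bar))
    return samples
```

On a straight contour the orientation samples are constant; on a circle of radius r they grow with slope close to 1/r, which is the property exploited in the parsing step.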

The advantage of such a transformation is that the edge fragments are now
easily extracted. Note that a straight line in the original image space maps to a
horizontal line in (θ-s) space, and a circular arc in the original image space maps
to a slanted line in (θ-s) space. Thus, to extract our edge fragments, we simply
need to parse the (θ-s) space representation. We do this by applying a simple
split-and-merge [Horowitz and Pavlidis 76, Chen and Pavlidis 79] algorithm to extract
linear segments from the transformed representation. To do this, we must specify
a bound on the maximum deviation between the straight line and the contour
being approximated.
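
The splitting half of such a parse can be sketched as follows. This is a simplified recursive split only (the merge pass is omitted), it measures deviation along the θ axis rather than perpendicular to the fitted line, and the name `split_segments` and the tolerance `tol` are assumptions.

```python
def split_segments(samples, tol):
    """Recursively split (s, theta) samples into pieces that are linear to within tol.

    Returns a list of (start_index, end_index) pairs into samples.
    """
    def deviation(i, j, k):
        # Deviation in theta of sample k from the line joining samples i and j.
        (s0, t0), (s1, t1), (s, t) = samples[i], samples[j], samples[k]
        if s1 == s0:
            return abs(t - t0)
        return abs(t - (t0 + (t1 - t0) * (s - s0) / (s1 - s0)))

    def recurse(i, j):
        if j - i < 2:
            return [(i, j)]
        k = max(range(i + 1, j), key=lambda m: deviation(i, j, m))
        if deviation(i, j, k) <= tol:
            return [(i, j)]
        return recurse(i, k) + recurse(k, j)   # split at the point of maximum deviation

    return recurse(0, len(samples) - 1)
```

A contour made of a straight run followed by an arc yields one horizontal piece and one slanted piece.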

Any non-horizontal line identifies a circular arc. Note that to determine
horizontal from non-horizontal lines, we need a bound on the angle between the line and
the s-axis. The radius r of the circular arc is given by the inverse slope of the
linear segment in (θ-s) space. To find the center of the circle, we use the following
method. First, we transform the circular segment back into the image space, and
choose a sampling of pairs of points from the transformed segment. Let 2ℓ denote
the separation of the two points. Next, we construct a perpendicular bisector to
this chord. The center of the circle must lie a distance √(r² − ℓ²) along the bisector.
We can determine on which side of the chord the center lies, by ensuring that the
points between the two sample points lie on the opposite side of the chord. We can
collect all such hypothesized centers, over some set of sample points, and use the
midpoint of the collection to determine the circle center.
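
The chord construction can be sketched as below. Two assumptions are made for illustration: the arc between each pair of sample points spans less than a half circle (so that the arc and the center lie on opposite sides of the chord), and the "midpoint of the collection" is taken to be the mean of the hypothesized centers. The function names are hypothetical.

```python
import math

def center_from_chord(p, q, arc_point, r):
    """Hypothesize a circle center from a chord pq, a point on the arc between, and radius r."""
    (x0, y0), (x1, y1) = p, q
    mx, my = (x0 + x1) / 2.0, (y0 + y1) / 2.0        # chord midpoint
    half = math.hypot(x1 - x0, y1 - y0) / 2.0        # half the chord length
    if half == 0.0 or half > r:
        return None                                  # chord inconsistent with radius
    d = math.sqrt(r * r - half * half)               # distance from chord to center
    nx, ny = -(y1 - y0) / (2.0 * half), (x1 - x0) / (2.0 * half)   # unit normal to chord
    side = (arc_point[0] - mx) * nx + (arc_point[1] - my) * ny
    sign = -1.0 if side > 0 else 1.0                 # center lies opposite the arc points
    return (mx + sign * d * nx, my + sign * d * ny)

def estimate_center(arc_points, r):
    """Average the centers hypothesized from all chords with an interior arc point."""
    centers = []
    n = len(arc_points)
    for i in range(n - 2):
        for j in range(i + 2, n):
            c = center_from_chord(arc_points[i], arc_points[j], arc_points[(i + j) // 2], r)
            if c is not None:
                centers.append(c)
    if not centers:
        return None
    return (sum(c[0] for c in centers) / len(centers),
            sum(c[1] for c in centers) / len(centers))
```

Repeating `estimate_center` over a range of candidate radii and keeping the circle with the smallest deviation from the data points is one way to handle a noisy radius estimate.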

This gives us an estimate of the center of the circle. Since the computation
of the circle radius may be noisy, we can extend this method by performing the
above computation for a range of possible values for the radius. For each
hypothesized radius and center, we can measure the deviation of the data points from the
hypothesized circle, and select the circle with minimum error.

Given the circle center and the two endpoints of the circular arc in image
coordinates, we can determine the limits on the swept angle straightforwardly. Finally,
if  we  know  the  sign  of  the  contrast  between  the  object  and  the  background,  we  can 
use  the  direction  of  the  change  in  edge  intensity  across  the  edge  to  determine  the 
pointing  direction  of  the  arc. 

To ensure that the computed fragments are optimal, we perform a second
split-and-merge stage, this time in the image space. That is, given a circular fragment,
computed  as  above,  we  test  that  all  of  the  data  points  lie  within  a  given  error  range 
of  the  hypothesized  fragment.  If  they  do  not,  we  split  the  data  points  at  the  point 
of  maximum  deviation,  and  perform  the  same  computation  on  each  of  the  subparts. 

We are left with the horizontal lines in (θ-s) space. To extract the linear edge
fragments, we transform all of the points along these lines back into the image space,
and run the split-and-merge algorithm again in this space. This allows us to extract
the endpoints of the linear fragments. The normal is orthogonal to the line between
the endpoints. If we know the sign of the contrast between the object and the
background, we can use the direction of the change in edge intensity across the edge
to determine the sign of the normal.

In our experience, this second split-and-merge stage in the image space is
important. If we simply rely on the first split-and-merge operation, we have found
that the actual edges in the image space corresponding to the horizontal lines in the
(θ-s) space have significant residual curvature. This is not surprising, since in one
case we are thresholding based on deviation in curvature, and in the other, we are
thresholding based on deviation from linear. As a consequence, the second
split-and-merge stage results in the segmentation of horizontal (θ-s) lines into several
image space lines, with much tighter fit.


4.  Putting  it  all  together 

4.1  Building  the  library  of  objects 

We now have the pieces needed to build our recognition engine. We begin by building
a library of object models. This is accomplished by placing each part in isolation
under a camera, and running the fragment extraction process described in Section 3.
This produces a set of linear edge fragments and a set of circular edge fragments,
defined in a local coordinate frame.

We can improve the efficiency of our recognition system by doing some
preprocessing on this representation. In particular, for each object, we build a set of tables
capturing the model halves of each of the constraints. For each unary constraint, we
build a one dimensional table, indexed by face number, in which we store the value
of the model half of the constraint. For example, for the length constraint, this
would involve computing and storing the length of each edge plus the error bound,

\[ L_q + \epsilon_L . \]

For each binary constraint, we build a two dimensional table, in which we store the
value of the model half of the constraint. For example, for the angle constraint,
this would involve computing and storing the range of angles between a pair of edge
normals, adjusted for error,

\[ [\theta_{pq} - \epsilon_a,\; \theta_{pq} + \epsilon_a] . \]

This precomputation makes the search process significantly faster, since half the
computation is reduced to a table lookup.
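
The two kinds of table can be sketched as follows, using the length and angle constraints as examples. The input representation (a list of edge lengths and a list of normal directions in radians) and the name `build_tables` are assumptions made for illustration.

```python
import math

def build_tables(lengths, normal_angles, eps_len, eps_ang):
    """Precompute the model halves of the length (unary) and angle (binary) constraints.

    lengths[q]: length of model edge q.
    normal_angles[q]: direction of the normal of model edge q, in radians.
    Returns (unary, binary) where unary[q] is the longest admissible data length and
    binary[p][q] is the admissible (low, high) range for the angle between normals.
    """
    unary = [length + eps_len for length in lengths]
    n = len(normal_angles)
    binary = [[None] * n for _ in range(n)]
    for p in range(n):
        for q in range(n):
            theta_pq = (normal_angles[q] - normal_angles[p]) % (2.0 * math.pi)
            binary[p][q] = (theta_pq - eps_ang, theta_pq + eps_ang)
    return unary, binary
```

During the search, a data edge of measured length d can then be tested against model edge q with a single comparison, `d <= unary[q]`; a measured angle is tested against `binary[p][q]`, bearing in mind that the comparison has to be made modulo 2π.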

Having  built  a  model  for  a  single  object,  we  can  straightforwardly  build  a  second 
model  for  the  mirror  reversal  of  the  object.  This  gives  us  two  models  per  object, 
but  allows  us  to  recognize  laminar  objects  in  either  stable  orientation. 

4.2  Processing  the  sensory  data 

Once we have constructed the library of objects, we are ready to process arbitrary
images of the objects. Using the process described in Section 3, we reduce a
grey-level image of a pile of parts to a set of linear and circular edge fragments. Next,
we apply a Hough transform to the data, for each model in the object library. This
yields a sorted list of Hough buckets for each model. We use our bound on the
goodness of match to remove any Hough buckets without sufficient contents from
consideration. Then, starting with the best Hough bucket, as measured over all
the objects, we apply our constrained search, using premature termination to stop
when a sufficiently good interpretation is found. If such an interpretation is found
for the current Hough bucket, we remove the edge fragments accounted for from
consideration, adjust the contents of the Hough buckets for all objects, and re-sort
each list of Hough buckets. We then proceed as before, continuing until no further
Hough buckets remain. If no interpretation is found for a Hough bucket, we simply
move on to the next best bucket and continue. An example of such processing is
shown in Figure 1.

4.3  Unknown  edge  normals 

In the preceding discussion, we have assumed that we can identify the correct
direction of the normals to linear edge fragments, and the pointing direction of the
circular arcs. This operation relies on knowing the contrast between the background
and the objects. If such information is available, the contrast across an intensity
edge will determine these properties.

In many cases, however, it is unreasonable to assume that this information will
be known. We can extend our system to deal with this case. One solution is based
on the following observation, and has been reported in [Grimson and Lozano-Perez
87]. As long as two edges do not cross or are not collinear, at least one edge must be
completely within one of the half planes bounded by the other. As a consequence,
the components along one of the edge normals of all possible separation vectors
will always have the same sign. Given a tentative pairing of two measured edge
fragments and two model edges, we can use this property to choose the sign of one
of the normals. The angle constraint can then be used to consistently select the
signs for other edges in that interpretation. This does allow the method to correctly
interpret data with unknown data edge normal signs, at a small increase in the
search cost.

A second solution is simply to double the number of sensory linear edge
fragments, one with each possible sign of the edge normal. The constraints will then
ensure that at most one of each such pair of edge fragments is included in the
interpretation, and the process can proceed as before. Here, the pointing direction
constraint is not used.
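
The doubling step of this second solution is simple enough to sketch directly. The fragment representation (an identifier paired with a unit normal) and the name `with_both_normals` are assumptions made for illustration.

```python
def with_both_normals(fragments):
    """Duplicate each linear data fragment, once for each possible sign of its normal.

    fragments: list of (fragment_id, (nx, ny)) with (nx, ny) one of the two unit normals.
    Returns a list of ((fragment_id, sign), normal) entries, two per input fragment.
    """
    doubled = []
    for frag_id, (nx, ny) in fragments:
        doubled.append(((frag_id, +1), (nx, ny)))
        doubled.append(((frag_id, -1), (-nx, -ny)))
    return doubled
```

Tagging each copy with its sign makes it straightforward to check afterwards that an interpretation uses at most one member of each pair.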

4.4  Other  extensions 

Although  we  have  presented  the  system  as  recognizing  objects  from  their  occluding 
boundaries,  it  is  more  broadly  applicable  than  this.  In  particular,  since  we  use  an 
edge  detector  to  extract  our  primitives  for  matching,  other  object  markings,  such 
as  albedo  or  material  changes,  or  surface  texture,  that  are  stable  across  a  range  of 
imaging  conditions  would  also  suffice. 

Three  dimensional  objects  that  are  known  to  be  in  stable  positions  can  also 
be  handled  using  this  method.  For  each  stable  position,  we  can  build  an  object 
model  by  running  the  front  end  of  the  system.  The  assumption  of  stable  position 
removes  the  effects  of  perspective,  and  allows  us  to  treat  the  problem  as  essentially 
a  two-dimensional  one. 


5.  Testing 


We have implemented and tested a version of the curved object recognition
system. Our implemented version differs slightly from the description given above.
In particular, we have not included any of the cross constraints, relying only on
the constraints between segments of the same type. Our expectation is that the
non-inclusion of such constraints should at worst increase the search time spent in
finding correct interpretations, without causing any incorrect interpretations to be
found.

We have run the system on a sequence of images similar to that shown in
Figure 1. Each image consisted of six overlapping parts, selected with repetition from
two different types of parts, and placed at random, with possible mirror reversals,
as shown. In each case, we asked the system to find as many interpretations as
possible from the library of parts, where each part could appear an arbitrary number of
times. After each interpretation was found, the accounted-for edges were removed
from the data, and the process was continued, until no further portions of the search
space remained unaccounted for.

For  each  image,  the  system  was  run  in  three  different  settings,  using  perimeter 
percentage  thresholds  of  .10,  .20  and  .50.  For  each  such  triple  of  settings,  the  system 
was run with two different tessellations of the Hough space. In the coarse case,
the  Hough  sampling  was  50  pixels  in  the  translation  components  (where  the  entire 
image  was  576  by  454)  and  36  degrees  along  the  rotation  axis.  In  the  fine  case,  the 
Hough  sampling  was  25  pixels  in  the  translation  components  and  18  degrees  along 
the  rotation  axis.  Over  5  trials,  the  system  had  the  performance  indicated  in  the 
following  table. 


                              Coarse Hough             Fine Hough
Perimeter %                  .50    .20    .10      .50    .20    .10
Correct                      2.8    5.2    4.8      2.6    5.2    5.4
Multiple *                   0.8, 0.8, 1.2
Mirror *                     0.2, 0.6, 0.2
Incorrect *                  0.8, 0.2, 0.2
Perimeter *                  .41
Real Nodes                   741    473    585      867    754    576
Real Model Tests             236    149    203      344    341    245
Final Nodes                 1942   1989   3252     1879   2385   6029
Final Model Tests *          627, 745, 1571, 787, 1143
Parsing Time *               290, 290, 290
Parsing Time *               135, 686
Search Time *                90, 106, 135, 75, 398, 434
% Search in Final Stage      .72    .81    .85      .68    .76    .91

* Some cells of this row are not legible in the source. The surviving values are
listed in reading order; their column positions cannot be recovered.

Each of the columns of the table indicates the results of using a different
threshold on the percentage of the perimeter of an object needed for a valid interpretation.
The correct line indicates the mean number of correct interpretations found over
the set of trials. The maximum number of valid interpretations is 6 per trial. The
incorrect line indicates the mean number of incorrect interpretations found per
trial. We also indicate the mean number of multiple interpretations, that is,
situations in which the system found nearly identical, correct interpretations, based on
different subsets of data, and we indicate the mean number of incorrect
interpretations involving the mirror reversal of an object.

Note  that  in  the  case  of  a  perimeter  percentage  of  .20,  the  system  found  almost 
all  of  the  possible  correct  interpretations.  Each  of  the  incorrect  interpretations 
involved  the  larger  object  in  Figure  1,  in  which  the  circular  structure  was  correctly 
matched,  but  at  the  wrong  orientation.  Since  the  circle  contributes  a  large  amount 
to  the  total  object  perimeter,  only  a  small  number  of  other  edges  were  needed  to 
find  a  feasible  but  incorrect  match.  The  interpretations  that  were  not  found  all 
involved  the  small  object  shown  in  Figure  1,  and  in  all  cases,  the  object  was  heavily 
occluded. In the case of a perimeter percentage of .50, all the found interpretations
were correct. In the case of a perimeter percentage of .10, the performance degraded
slightly, with more incorrect or mirror interpretations. This is not surprising, since
we are only requiring 10% of the object to be matched in situations involving a
reasonable amount of clutter. The perimeter line of the table indicates the average
percentage  of  the  object’s  perimeter  actually  included  in  the  interpretation. 

The  real  nodes  line  and  the  real  model  tests  line  indicate  the  mean  number 
of  nodes  of  the  interpretation  tree,  and  the  mean  number  of  model  transformation 
tests  performed  for  each  of  the  interpretations  found.  The  final  nodes  and  final 
model  tests  lines  indicate  the  amount  of  search  performed  after  the  last  interpre¬ 
tation  was  found  in  each  trial.  Not  surprisingly,  these  numbers  are  much  higher, 
since  considerably  more  effort  is  involved  in  verifying  that  no  further  interpretations 
can  be  found  using  the  remaining  scattered  data  fragments. 

The  Time  lines  in  the  table  indicate  the  mean  time  involved  in  parsing  the 
intensity  edges  into  linear  and  circular  fragments,  in  transforming  these  fragments 
into  the  Hough  space,  and  in  executing  the  actual  search  process.  The  times  are 
reported  in  seconds  of  elapsed  time  for  an  implementation  on  a  Symbolics  Lisp 
Machine,  without  floating  point  hardware.  The  final  line  indicates  the  portion  of 
the  search  time  that  was  spent  in  verifying  that  no  further  interpretations  remained. 
These  timing  statistics  are  intended  only  for  comparative  purposes.  A  number  of 
optimizations of the code are possible, and would considerably reduce these numbers.
For  instance,  in  the  Hough  transformation,  we  are  using  a  very  fine  sampling  of 
rotation  angles,  yielding  a  large  number  of  nearly  overlapping  polygons  in  Hough 
space,  which  are  then  intersected  with  the  buckets  of  the  Hough  space.  Considerable 
savings  could  be  obtained  by  using  a  coarser  sampling  at  the  expense  of  possibly 
missing  a  feasible  Hough  bucket  on  occasion.  Similarly,  in  the  parsing  of  the  input 
data,  we  are  using  an  exhaustive  search  to  find  the  best  estimate  of  the  radius  and 
center  of  the  circular  fragments.  This  accounts  for  80%  of  the  time  reported,  and 
could  clearly  be  sped  up. 

There are several interesting points about the described testing. First, note
that most of the search is spent in verifying that no further interpretations exist. In
general, the correct interpretations are found with very little search. This suggests
that the system in fact behaves as a hypothesize-and-test system, in which the Hough
transform serves to hypothesize possible interpretations, that are then verified by the
constraint satisfaction process. We note that the Hough transform is not sufficient,
alone, as we have frequently observed that the biggest Hough bucket did not result in
a correct interpretation. Moreover, there can be considerable diffusion in the Hough
space, due to the errors in the sensory data. As a consequence, a large number of
Hough buckets may have comparable sized contents. This is illustrated in Figure 6,
which shows the Hough space for one of the objects of Figure 1, at two different
resolutions.


Second, we note that if the system can correctly identify most of the objects
in  the  scene,  then  it  would  seem  that  it  should  not  have  to  spend  considerable 
time  in  additional  search  verifying  that  no  further  interpretations  exist,  since  in 
general  most  of  the  data  fragments  should  already  be  accounted  for.  As  can  be 
seen  from  the  table,  however,  considerable  effort  is  spent  in  doing  this.  In  part, 
this  follows  from  the  fact  that  the  interpretations  found  by  the  system  frequently 
do  not  account  for  all  the  data  fragments  arising  from  the  object.  This  occurs 
for several reasons. First, the (θ-s) space segmentation scheme does not produce
canonical partitions of the input data. Hence, small deviations in the image may
cause a noticeably different data segmentation, and some data edges may be different
enough from the model to be excluded from the interpretation. Our experience
suggests that this sensitivity may be more true of the (θ-s) space segmentation
than  of  strictly  polygonal  segmentations.  This  sensitivity  to  segmentation  could 
be  handled  by  increasing  the  error  bounds  discussed  in  the  next  section.  This 
is  dangerous,  however,  since  the  increased  bounds  are  also  likely  to  cause  more 
accidental  alignments  of  data  fragments  to  be  incorrectly  interpreted.  A  better 
solution  would  be  to  do  additional  verification  in  the  image  space.  That  is,  having 
found  a  correct  interpretation  based  on  moderate  error  bounds,  one  could  then 
project  the  interpretation  back  into  the  image,  and  using  looser  bounds  search  for 
additional  data  fragments  that  are  in  agreement  with  the  projected  object  position. 


Many of the incorrect interpretations involved solutions in which the large
circular hole of the large object shown in Figure 1 was matched correctly, but the
overall orientation of the solution was incorrect. Since there is an inherent
ambiguity in the rotation of the object about the center of the hole, while at the same time,
the perimeter of the hole contributes a large portion of the overall perimeter, if a
small portion of the object happens to align accidentally with some data fragment,
we can obtain an incorrect interpretation that accounts for a noticeable portion of
the object's perimeter. We need some means of handling this problem, perhaps by
using a variant of the Feature Focus method of [Bolles 1982].


Finally, note that the different samplings of the Hough space did not lead to
significantly different performance in terms of the number of interpretations.


Figure 6. Samplings of the Hough space. The top part shows the Hough space for the
smaller object of Figure 1, sampled at a coarse resolution. Each frame represents a different
orientation, increasing from left to right. Within each frame, the contents of each of the
translation buckets are shown, with the size of the dot indicating the size of the bucket's
contents. Note the smearing of this measure over a broad extent of the space. The bottom
part shows the same Hough space at a finer resolution.


6.  Free  parameters  and  error  bounds 

In  describing  our  recognition  system,  we  have  used  a  number  of  free  parameters 
and  error  bounds.  While  at  first  glance  there  appear  to  be  a  large  number  of  such 
free  parameters,  in  fact,  many  of  them  are  interrelated,  and  only  a  few  need  to  be 
determined  in  order  to  run  the  system. 

The first parameter is the bound on the accuracy of measuring the position
of an edge point, $\epsilon_p$. This bound is a function of the camera system and the edge
detector used. Since we are using a Marr-Hildreth operator, the accuracy of the
system could be determined from formal analysis [Berzins 84], or could be measured
empirically. Note that since this is simply an upper bound, we can be conservative
in our estimates.

Given this bound $\epsilon_p$, a number of the other free parameters follow directly. For
example, suppose that the position of an edge point is known to within the error
bound $\epsilon_p$. If $L_{min}$ is a lower bound on the length of the edges, it is straightforward
to show that the maximum error in the measured angle between edge normals is
given by


The bound on measuring the length of a linear fragment is also determined by
$\epsilon_p$; in particular, the worst case bound is given by

$$2\epsilon_p.$$
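
How such bounds follow from the edge point accuracy can be illustrated numerically. The sketch below is not the formula from the text. It assumes a simple error model in which each endpoint of an edge lies within a disk of radius `eps_p` around its true position. Under that assumption the difference vector between the endpoints changes by at most `2 * eps_p`, so the edge direction rotates by at most `asin(2 * eps_p / length)` and the edge length changes by at most `2 * eps_p`. All function names are hypothetical.

```python
import math
import random

def direction_error_bound(eps_p, length):
    """Worst-case rotation of an edge whose endpoints each move by at most eps_p."""
    if length <= 2 * eps_p:
        raise ValueError("edge too short for a meaningful direction bound")
    return math.asin(2 * eps_p / length)

def angle_between_edges_bound(eps_p, l_min):
    """Bound on the error in the angle between two edges, each at least l_min long."""
    return 2 * direction_error_bound(eps_p, l_min)

def length_error_bound(eps_p):
    """Worst-case error in the measured length of a linear fragment."""
    return 2 * eps_p

def _random_offset(rng, radius):
    """Uniform random point in a disk of the given radius."""
    r = radius * math.sqrt(rng.random())
    a = rng.uniform(0.0, 2.0 * math.pi)
    return r * math.cos(a), r * math.sin(a)

def worst_observed(eps_p, length, trials=20000, seed=0):
    """Empirical worst direction and length errors for an edge along the x axis."""
    rng = random.Random(seed)
    worst_angle = worst_length = 0.0
    for _ in range(trials):
        ax, ay = _random_offset(rng, eps_p)
        bx, by = _random_offset(rng, eps_p)
        dx, dy = length + bx - ax, by - ay
        worst_angle = max(worst_angle, abs(math.atan2(dy, dx)))
        worst_length = max(worst_length, abs(math.hypot(dx, dy) - length))
    return worst_angle, worst_length
```

Since the angle between two edge normals equals the angle between the edges themselves, doubling the single-edge bound at the minimum length gives a conservative bound for any pair of edges.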

The error in measuring the radius of a circular arc, $\epsilon_r$, will generally be on the
order of the error in measuring the position of an edge point, $\epsilon_p$. Since the radius is
determined by taking the slope of a line in $\theta$-$s$ space, it is likely to be less than this,
but using $\epsilon_p$ is a conservative bound. The error in measuring the position of the center
of a circular arc, $\epsilon_c$, will also typically be bounded by $\epsilon_p$. Similarly, a conservative
bound on the error in measuring the swept angle range is the angular bound $\epsilon_a$.
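
The relation between the slope of a line in $\theta$-$s$ space and the radius of the corresponding arc can be illustrated with a short sketch. It relies on the standard fact that along a circle of radius $R$ the tangent orientation changes at a rate $d\theta/ds = 1/R$, so a circular arc maps to a line of slope $1/R$ and a straight edge to a horizontal line. The sampling scheme, the least-squares fit, and all function names below are assumptions made for illustration, not the procedure used in the system.

```python
import math

def theta_s_samples(points):
    """Map a polyline to (arc length, tangent orientation) samples."""
    samples = []
    s = 0.0
    prev_theta = None
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = math.hypot(x1 - x0, y1 - y0)
        theta = math.atan2(y1 - y0, x1 - x0)
        if prev_theta is not None:
            # Unwrap so that successive orientations never jump by 2*pi.
            while theta - prev_theta > math.pi:
                theta -= 2 * math.pi
            while theta - prev_theta <= -math.pi:
                theta += 2 * math.pi
        samples.append((s + step / 2, theta))  # sample at the chord midpoint
        s += step
        prev_theta = theta
    return samples

def fit_slope(samples):
    """Least-squares slope of theta against s."""
    n = len(samples)
    s_mean = sum(s for s, _ in samples) / n
    t_mean = sum(t for _, t in samples) / n
    num = sum((s - s_mean) * (t - t_mean) for s, t in samples)
    den = sum((s - s_mean) ** 2 for s, _ in samples)
    return num / den

def radius_from_slope(slope):
    """Radius of the arc whose theta-s line has the given slope."""
    return math.inf if slope == 0 else 1.0 / abs(slope)

def arc_points(center, radius, start, sweep, n):
    """n + 1 evenly spaced points on a circular arc (used to make test data)."""
    cx, cy = center
    return [(cx + radius * math.cos(start + sweep * i / n),
             cy + radius * math.sin(start + sweep * i / n)) for i in range(n + 1)]
```

For example, `radius_from_slope(fit_slope(theta_s_samples(arc_points((1.0, 2.0), 5.0, 0.0, math.pi / 2, 200))))` recovers a radius close to 5, and the angle that line makes with the horizontal axis is `math.atan(1 / 5.0)` when both axes use the same units.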

The bound on the split and merge algorithm, $\epsilon_{sm}$, is something that we must
set by hand. Note that so long as our models are built using the same value of the
parameter as that used in processing sensory data, and so long as this value is not
too large, the exact value is not critical.

Setting the parameter that distinguishes straight lines from circular arcs
can be done based on properties of the objects to be recognized. In particular, since
this parameter is a bound on the angle between the horizontal axis and a line in
$\theta$-$s$ space, if the radius of the largest circular arc on any object is known, then
the parameter can be set from that radius.


Thus, the various error bounds in the algorithm can be determined by measuring
the accuracy of the system in determining the position of an edge point,
by specifying the minimum length required for an edge fragment, and by
specifying the maximum radius of a circular arc.

There is one other threshold in our system, namely the threshold used to
determine an acceptably sized interpretation. We have indicated that our measure
of an interpretation is the sum of the lengths of the linear data fragments in the
interpretation plus the sum of the arc lengths of the circular data fragments in the
interpretation. We use a threshold on this measure in two places: to remove small
Hough buckets from the search process, and to prematurely terminate the IT search
once an acceptable match is found. Unfortunately, there does not seem to be any
principled way of setting this threshold. Clearly, we can trade off false positives and
false negatives by varying it, since the smaller the threshold, the more likely an
incorrect interpretation is accepted, while the larger the threshold, the more likely
that correct interpretations will be missed. In our experiments, we have typically
left the threshold at 20-25% of the total perimeter of an object.
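
The measure and its threshold can be written down in a few lines. The sketch below assumes that each circular fragment is described by a radius and a swept angle in radians, so that its arc length is their product. The function names and the default fraction are illustrative choices, not values fixed by the system.

```python
import math

def interpretation_measure(linear_lengths, arcs):
    """Sum of linear fragment lengths plus arc lengths of circular fragments.

    arcs is a sequence of (radius, swept_angle_in_radians) pairs.
    """
    return sum(linear_lengths) + sum(r * abs(sweep) for r, sweep in arcs)

def is_acceptable(measure, model_perimeter, fraction=0.25):
    """True if the matched length reaches the given fraction of the perimeter."""
    return measure >= fraction * model_perimeter

# Example: two linear fragments and one quarter-circle arc of radius 2,
# matched against a model whose total perimeter is 40.
measure = interpretation_measure([3.0, 2.0], [(2.0, math.pi / 2)])
accepted_at_20 = is_acceptable(measure, 40.0, fraction=0.20)
accepted_at_25 = is_acceptable(measure, 40.0, fraction=0.25)
```

In this example the matched length is about 8.14, so the interpretation passes at the 20% level but not at the 25% level, which illustrates how the choice within that range trades false positives against false negatives.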

Also note that a straightforward application of a threshold on perimeter
ignores information about what portions of the object are matched. For example, an
interpretation accounting for 25 percent of the object, but in which all 25 percent
came from one end of the object, may be less reliable than an interpretation in which
the 25 percent is spread out over the perimeter of the object.


7.  Relation  to  previous  work 


The literature on object recognition systems is extensive, and stretches over a period
of at least twenty years. Of the variety of different techniques examined, a number
of authors have taken a view similar to ours, that recognition can be structured as
an explicit search for a match between data elements and model elements [Ayache
and Faugeras 86, Baird 85, Bolles and Cain 82, Bolles, Horaud and Hannah 83,
Browse 87, Drumheller 87, Faugeras and Hebert 83, Grimson and Lozano-Perez 84,
Goad 83, Kalvin et al. 86, Knoll and Jain 85, Lowe, Murray 87, Pollard et al.
87, Schwartz and Sharir 87, Stockman and Esteva 81]. Of these, the work of Bolles
and his colleagues, Faugeras and his colleagues, and that of Baird are closest to the
approach presented here.

The interpretation tree approach is an instance of the consistent labeling problem
that has been studied extensively in computer vision and artificial intelligence
[Waltz 75, Montanari 74, Mackworth 77, Freuder 78, 82, Haralick and Shapiro 79,
Haralick and Elliott 80, Mackworth and Freuder 85]. This paper can be viewed as
suggesting a particular consistency relation (the constraints on distances, angles,
and radii) and exploring its performance. An alternative approach to the solution
of consistent labeling problems is the use of relaxation. A number of authors have
investigated this approach to object recognition [Ayache and Faugeras 82, Bhanu
and Faugeras 84, Davis 79, Rutkowski et al. 81, Rutkowski 82]. These techniques
are more suitable for implementation on parallel machines.

The use of a $\theta$-$s$ space or some equivalent to extract representations of curved
laminar objects has been previously investigated. Perkins [78, 80] describes a system
using similar representational fragments, extracted from a $\theta$-$s$ space, as well as some
simple constraints for determining potential matches. These are then evaluated
using cross-correlation in $\theta$-$s$ space. Other systems that use $\theta$-$s$ space to partition
input data into segments include [Barrow and Popplestone 71, Clemens 86, Martin
and Aggarwal 79, McKee and Aggarwal 77, Turney et al. 85]. The Curvature Primal
Sketch developed by Asada and Brady [86], and used in a recognition system by
Ettinger [87], also uses an explicit representation of changes in the edge contours as
a basis for matching objects.